[tritonbench] Benchmarking without docker with single CPU thread #118

xuzhao9 · 2025-12-15T21:03:48Z

On AMD, we observed high variance when benchmarking in dind: meta-pytorch/tritonbench#726

On NVIDIA B200 runner, we observed high variance when benchmarking with multiple CPU cores: #130

We are pinning the job to a single CPU thread to stabilize the benchmark results.
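For readers who want to try the same idea outside Docker, here is a minimal sketch of pinning a process to one CPU core on Linux. It is not taken from this PR's workflow change: the helper name `pin_to_single_cpu` is hypothetical, and it assumes a Linux host where `os.sched_setaffinity` is available.

```python
import os


def pin_to_single_cpu(cpu=None):
    """Restrict the current process to one CPU core and return its index.

    Hypothetical helper for illustration; Linux only, because it relies on
    os.sched_getaffinity / os.sched_setaffinity.
    """
    # CPUs the process is currently allowed to run on.
    allowed = sorted(os.sched_getaffinity(0))
    # Default to the lowest-numbered allowed core.
    target = allowed[0] if cpu is None else cpu
    if target not in allowed:
        raise ValueError(f"CPU {target} is not in the allowed set {allowed}")
    # pid 0 means "the calling process"; child processes inherit the mask.
    os.sched_setaffinity(0, {target})
    return target


if __name__ == "__main__":
    core = pin_to_single_cpu()
    print(f"pinned to CPU {core}; affinity is now {os.sched_getaffinity(0)}")
```

Because the affinity mask is inherited by child processes, calling this before launching a benchmark subprocess would keep that subprocess on the same core. Whether this matches the pinning mechanism used in the workflow is an assumption.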

Test plan:
https://github.com/pytorch/pytorch-integration-testing/actions/runs/20983664412

.github/workflows/tritonbench.yml

huydhn

LGTM!

huydhn · 2026-01-14T21:52:38Z

Just FYI, with the docker-in-docker setup for multi-tenancy, here is the environment the CI uses when a job runs without a Docker image: https://github.com/meta-pytorch/pytorch-gha-infra/blob/main/multi-tenant/images/multi-tenant-gpu/Dockerfile

xuzhao9 · 2026-01-14T22:49:26Z

Yeah, we hit a few issues with dind on AMD, plus it seems --cpuset-cpus doesn't work in dind. We decided to move to non-dind for now.

xuzhao9 added 2 commits December 15, 2025 16:00

do not benchmark b200 on dind

067ca54

test benchmaring without docker

4ac24f0

meta-cla bot added the cla signed label Dec 15, 2025

xuzhao9 had a problem deploying to pytorch-x-vllm December 15, 2025 21:04 — with GitHub Actions Failure

add test

893f1eb

xuzhao9 had a problem deploying to pytorch-x-vllm December 15, 2025 21:19 — with GitHub Actions Failure

fix test

b108b24

xuzhao9 had a problem deploying to pytorch-x-vllm December 15, 2025 21:23 — with GitHub Actions Failure

fix test

e9db6d9

xuzhao9 had a problem deploying to pytorch-x-vllm December 15, 2025 21:27 — with GitHub Actions Error

setup cuda

25c5239

xuzhao9 had a problem deploying to pytorch-x-vllm December 15, 2025 21:37 — with GitHub Actions Failure

xuzhao9 had a problem deploying to pytorch-x-vllm December 15, 2025 22:07 — with GitHub Actions Failure

reduce CPU threads

6b8abc1

xuzhao9 temporarily deployed to pytorch-x-vllm December 16, 2025 02:35 — with GitHub Actions Inactive

test on 8 gpu runner

82d2c73

xuzhao9 had a problem deploying to pytorch-x-vllm December 17, 2025 16:32 — with GitHub Actions Failure

b200 8gpu

abd4a2a

xuzhao9 temporarily deployed to pytorch-x-vllm December 18, 2025 04:43 — with GitHub Actions Inactive

xuzhao9 mentioned this pull request Jan 14, 2026

[tritonbench] fix tritonbench noise issue #132

Closed

xuzhao9 added 4 commits January 13, 2026 22:10

test dind

ab4631c

single core

469234e

single core

1ae9d43

single core

7c33d2d

xuzhao9 had a problem deploying to pytorch-x-vllm January 14, 2026 05:13 — with GitHub Actions Failure

disable dind

9097b6e

xuzhao9 had a problem deploying to pytorch-x-vllm January 14, 2026 05:17 — with GitHub Actions Failure

bugfix

5e2b7ac

xuzhao9 had a problem deploying to pytorch-x-vllm January 14, 2026 05:29 — with GitHub Actions Error

bugfix

7afe0ec

xuzhao9 had a problem deploying to pytorch-x-vllm January 14, 2026 05:30 — with GitHub Actions Failure

build triton

0532d00

xuzhao9 temporarily deployed to pytorch-x-vllm January 14, 2026 05:36 — with GitHub Actions Inactive

xuzhao9 changed the title ~~[wip][tritonbench] Test benchmarking without docker~~ [tritonbench] Benchmarking without docker with single CPU thread Jan 14, 2026

xuzhao9 mentioned this pull request Jan 14, 2026

B200 runner noise issue #130

Closed

xuzhao9 requested a review from huydhn January 14, 2026 16:12

huydhn reviewed Jan 14, 2026

.github/workflows/tritonbench.yml (outdated, resolved)

huydhn approved these changes Jan 14, 2026

move to main

80ba761

xuzhao9 merged commit 3e0093a into main Jan 14, 2026
1 check passed

xuzhao9 deleted the xz9/disable-dind branch January 15, 2026 17:36
